Template Specification
JSON Template Structure
Template Descriptor
property | type | required | description |
---|---|---|---|
meta | TemplateMeta | Template metadata consisting of a name and description. Note: available from template schema version 1.1.0 | |
templateSchemaVersion | version | Template schema version in major.minor.patch format, where major, minor, patch are non-negative integers. For details see section below. | |
processed | boolean | If true, indicates that the template is processed, i.e. all properties have already been extracted from the PDF and are specified inside the template. If false, indicates that some properties may still require extraction from the PDF. | |
dataFields | list(DataField) | The list of specified data fields. |
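As an illustration of the `templateSchemaVersion` format described above, the following minimal Python sketch parses a `major.minor.patch` string into three non-negative integers. The names `SchemaVersion` and `parse_schema_version` are hypothetical and not part of any SDK; rejecting malformed strings with `ValueError` is an assumption, since this specification does not state how invalid versions are handled.

```python
import re
from typing import NamedTuple


class SchemaVersion(NamedTuple):
    """Hypothetical container for a parsed template schema version."""
    major: int
    minor: int
    patch: int


# Three dot-separated runs of ASCII digits, e.g. "1.1.0".
_VERSION_RE = re.compile(r"([0-9]+)\.([0-9]+)\.([0-9]+)")


def parse_schema_version(text: str) -> SchemaVersion:
    """Parse 'major.minor.patch' into non-negative integers.

    Assumption: anything that does not match the pattern is rejected.
    """
    match = _VERSION_RE.fullmatch(text)
    if match is None:
        raise ValueError(f"invalid templateSchemaVersion: {text!r}")
    return SchemaVersion(*(int(part) for part in match.groups()))
```

For example, `parse_schema_version("1.1.0")` yields a tuple with `major == 1`, `minor == 1` and `patch == 0`.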
TemplateMeta Descriptor
property | type | required | description |
---|---|---|---|
name | String | Template name. | |
description | String | Template description. |
DataField Descriptor
property | type | required | description |
---|---|---|---|
name | string | The name of the data field. Shall be unique among all data fields in the current template JSON. | |
dataType | string | The type of Data Field. Note: available from template schema version 1.1.0 | |
referencePdfOrigin | PdfLocation | Describes the location of the data field inside the referenced PDF file. For the reference to be complete, the PDF file shall be specified; how to specify the PDF reference is described in Meta of the Template Archive Specification. Required if the data field contains unprocessed selectors. More information can be found in the Template Archive File Specification. | |
multipleAllowed | boolean | Specifies whether the data field may match more than one occurrence. If false and more than one match is found, the SDK shall report an error. If missing, it shall be treated as false. | |
zeroAllowed | boolean | Specifies whether the data field may match zero occurrences. If false and no match is found, the SDK shall report an error. If missing, it shall be treated as false. | |
selectors | list(Selector) | The list of selectors. |
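The `multipleAllowed` and `zeroAllowed` rules above can be sketched as a small check on the number of matches found for a data field. This is only an illustration: the function name `check_match_count` is hypothetical, and raising `ValueError` is an assumed stand-in for however the SDK actually reports the error.

```python
def check_match_count(field: dict, match_count: int) -> None:
    """Raise if the number of matches violates the data field's settings.

    `field` is a DataField object as parsed from the template JSON.
    Both flags are treated as false when missing, as the table states.
    """
    multiple_allowed = field.get("multipleAllowed", False)
    zero_allowed = field.get("zeroAllowed", False)
    name = field.get("name", "<unnamed>")

    if match_count == 0 and not zero_allowed:
        raise ValueError(f"data field {name!r}: no match found")
    if match_count > 1 and not multiple_allowed:
        raise ValueError(
            f"data field {name!r}: {match_count} matches found, "
            "but only one is allowed"
        )
```

With this sketch, a field that omits both flags accepts exactly one match and rejects zero or several.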
PdfLocation Descriptor
The PDF Location is a rectangle (specified by its coordinates) on the page.
property | type | required | description |
---|---|---|---|
page | integer | The 1-based page index of the specified PDF location. | |
location | Rectangle | The rectangle that specifies the location on the page, in the PDF page's coordinate space. |
Rectangle Descriptor
property | type | required | description |
---|---|---|---|
left | double | The left coordinate of the rectangle (lower horizontal value). | |
right | double | The right coordinate of the rectangle (upper horizontal value). | |
top | double | The top coordinate of the rectangle (upper vertical value). | |
bottom | double | The bottom coordinate of the rectangle (lower vertical value). |
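The constraints implied by the PdfLocation and Rectangle descriptors (a 1-based page index, `left` being the lower horizontal value and `bottom` the lower vertical value) can be sketched as a sanity check. The function `validate_pdf_location` is hypothetical, and allowing equal coordinates (a zero-width or zero-height rectangle) is an assumption that this specification does not settle.

```python
from typing import List


def validate_pdf_location(loc: dict) -> List[str]:
    """Return a list of problems found in a PdfLocation object (empty if none)."""
    problems = []

    page = loc.get("page")
    if not isinstance(page, int) or isinstance(page, bool) or page < 1:
        problems.append("page must be an integer >= 1 (1-based index)")

    rect = loc.get("location")
    if not isinstance(rect, dict):
        problems.append("location must be a Rectangle object")
        return problems

    keys = ("left", "right", "top", "bottom")
    bad = [k for k in keys if not isinstance(rect.get(k), (int, float))]
    if bad:
        problems.append("missing or non-numeric coordinates: " + ", ".join(bad))
        return problems

    # left/bottom are the lower values, right/top the upper values.
    if rect["left"] > rect["right"]:
        problems.append("left must not exceed right")
    if rect["bottom"] > rect["top"]:
        problems.append("bottom must not exceed top")
    return problems
```

A location such as page 1 with `left` 11.0, `right` 111.0, `top` 111.0 and `bottom` 11.0 passes this check with no problems reported.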
Selector Descriptor
Selector Descriptor specification can be found in Selector Specification.
Template Examples
Full template.json example
{
  "meta": {
    "name": "name",
    "description": "description"
  },
  "templateSchemaVersion": "1.1.0",
  "processed": true,
  "dataFields": [
    {
      "name": "Data field name",
      "dataType": "dataType",
      "referencePdfOrigin": {
        "page": 1,
        "location": {
          "left": 11.0,
          "right": 111.0,
          "top": 111.0,
          "bottom": 11.0
        }
      },
      "multipleAllowed": false,
      "zeroAllowed": false,
      "selectors": [
        {
          "selectorType": "iban"
        }
      ]
    }
  ]
}
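To show how a template like the one above might be consumed, here is a minimal Python sketch that loads the JSON with the standard `json` module and checks the rule that data field names are unique within a template. The helper `duplicate_field_names` is hypothetical and not an SDK function; the embedded template is a shortened copy of the example.

```python
import json
from collections import Counter

# Shortened copy of the example template, embedded so the sketch is self-contained.
TEMPLATE_JSON = """
{
  "meta": {"name": "name", "description": "description"},
  "templateSchemaVersion": "1.1.0",
  "processed": true,
  "dataFields": [
    {
      "name": "Data field name",
      "dataType": "dataType",
      "multipleAllowed": false,
      "zeroAllowed": false,
      "selectors": [{"selectorType": "iban"}]
    }
  ]
}
"""


def duplicate_field_names(template: dict) -> list:
    """Return the sorted data field names that occur more than once."""
    counts = Counter(field["name"] for field in template.get("dataFields", []))
    return sorted(name for name, count in counts.items() if count > 1)


template = json.loads(TEMPLATE_JSON)
duplicates = duplicate_field_names(template)
if duplicates:
    raise ValueError(f"duplicate data field names: {duplicates}")
```

The example template contains a single data field, so no duplicates are reported.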
JSON Template Schema Version
The JSON Template schema version consists of three numbers: major, minor and patch. Every schema update is expected to bump the schema version.
Bumping rules:
- patch: the patch version is bumped if the introduced changes relate to a completely new feature and do not break backward compatibility. This means that the new template produces the same result on older SDK versions. Likewise, parsing the result in the new SDK version with the same approach as for the old SDK (ignoring the new feature) produces the same result. Example: adding new parameters to the template schema that affect the result structure for the new SDK version. Users do not need to worry about using a template file with a non-matching patch version. In addition, a file with a lower patch version can be safely updated to a higher patch version simply by updating the version string.
- minor: the minor version is bumped if parsing compatibility has not been broken but the SDK release contains changes related to the recognition results. This means that the new feature can be parsed normally by the previous SDK (either ignored or handled by non-failing parsing logic), but the older and newer SDKs may produce different results when the result is parsed using the old result structure. Examples: changes in the meaning of template schema fields that affect result values, new selectors, additional selector parameters, etc. The user should be warned (warning log) when using a template with a non-matching minor version. In addition, a file with a lower minor version can be updated to a higher minor version simply by updating the version string (assuming the warning has been acknowledged).
- major: the major version is bumped if parsing compatibility has been broken. A new SDK release may be able to work with the old major version as well (with a warning), but an old SDK should not work with a new major template version, as the result might be unpredictable. This means that some of the old features in an old SDK may not work properly with the new template schema, or may even produce a parsing exception later, after version validation. Examples: renaming parameters, changing data types, introducing new structures, etc. On a non-matching major version, the SDK may either throw an exception or warn the user (warning log) if it can parse the old major version without an explicitly converted template. In addition, for each major version upgrade the SDK may provide an API to convert a template with the old major version into the new one.
For the compatibility matrix of JSON Template Schema versions across SDKs, see JSON Template Schema Version support.
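The bumping rules above suggest a simple way to classify how a template's schema version differs from the schema version an SDK supports. The sketch below is an illustration only: `classify_compatibility` and its return values are hypothetical, and the actual SDK behaviour is defined by the compatibility matrix referenced above, not by this code.

```python
def classify_compatibility(template_version: str, supported_version: str) -> str:
    """Return the most significant version component that differs.

    Result is one of "match", "patch", "minor" or "major".
    Both arguments are 'major.minor.patch' strings.
    """
    template = tuple(int(part) for part in template_version.split("."))
    supported = tuple(int(part) for part in supported_version.split("."))
    if len(template) != 3 or len(supported) != 3:
        raise ValueError("versions must have the form major.minor.patch")

    if template[0] != supported[0]:
        # Parsing compatibility is broken: throw, or warn if the old
        # major version can still be parsed.
        return "major"
    if template[1] != supported[1]:
        # Parsing works, but recognition results may differ: log a warning.
        return "minor"
    if template[2] != supported[2]:
        # Backward compatible: no action needed.
        return "patch"
    return "match"
```

A caller could map `"minor"` to a warning log entry and `"major"` to either an exception or a warning, depending on whether the old major version can still be parsed.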